[57] ABSTRACT

Various systems and methods of scheduling media segments 
of varying display rate, length and/or periodicity on at least 
one clustered, vertically-striped or horizontally-striped con- 
tinuous media database volume. With respect to the at least 
one horizontally-striped database volume, one method 
includes the steps of: (1) associating a display value with 
each of the media segments, (2) sorting the media segments 
in a non-increasing order of value density to obtain an 
ordered list thereof and (3) building a scheduling tree of the 
media segments, the scheduling tree having a structure that 
increases a total display value of the media segments. 

39 Claims, 11 Drawing Sheets 
SCHEDULING RESOURCES FOR
CONTINUOUS MEDIA DATABASES 

TECHNICAL FIELD OF THE INVENTION 

The present invention is directed, in general, to 
continuous-media-on-demand ("CMOD") services and, 
more specifically, to systems and methods for increasing the 
performance of databases that provide CMOD services 
(so-called "continuous media databases"). 

BACKGROUND OF THE INVENTION 

In recent years, significant advances in both networking 
technology and technologies involving the digitization and 
compression of continuous media data (e.g., video and audio 
data) have taken place. For example, it is now possible to 
transmit several gigabytes of data per second over fiber optic 
networks. With compression standards such as Moving Picture
Experts Group ("MPEG")-1, the bandwidth required for
transmitting video has become relatively low. These
advances have resulted in a host of new applications involving
the transmission of media over communications
networks, such as Enhanced Pay-Per-View ("EPPV"),
video-on-demand ("VOD"), on-line tutorials and interactive 
television. Continuous-media-on-demand ("CMOD") serv- 
ers are one of the key components necessary to provide the 
above applications. Depending on the application, the con- 
tinuous media servers may be required to store hundreds of 
media segment programs and concurrently transmit continu- 
ous media data to a few hundred clients. The transmission 
rate for such data is typically a given rate contingent upon 
the media type and the compression technique employed by 
the continuous media server. For example, the transmission 
rate for MPEG-1 is approximately 1.5 Mbps. 

Continuous media ("CM") data segments, for example 
movies and other on-demand programming, are transmitted 
from random access memory ("RAM") in the CM server to 
the clients. However, due to the voluminous nature of media
segment data (e.g., a 100-minute MPEG-1 video requires
approximately 1.125 GB of storage space) and the relatively
high cost of RAM, storing media segments in RAM is
prohibitively expensive. A cost-effective alternative
for storing media segments on a CM server involves using
magnetic or optical disks instead of RAM. Media
segments stored on disks, however, need to be retrieved
into RAM before they can be transmitted to clients by the CM
server. Modern magnetic and optical disks also have
limited storage capacity, e.g., 1 GB to 9 GB, and relatively
low transfer rates for retrieving data from these disks to
RAM, e.g., 30 Mbps to 60 Mbps. This limited storage
capacity affects the number of individual media segments 
that can be stored on the CM server and, along with the low 
transfer rates, affects the number of clients that can be 
concurrently serviced. A naive storage scheme in which an 
entire media segment is stored on an arbitrarily-chosen disk 
could result in disks with popular media programming being
over-burdened with more requests than can be supported,
while other disks with less popular programs remain idle.
Such a scheme results in an ineffective utilization of disk 
bandwidth, the term "disk bandwidth" referring to an 
amount of data which can be retrieved from a disk over a 
period of time. When data is not being retrieved from a disk, 
such as when the disk is idle or when a disk head is being 
positioned, disk bandwidth is not being utilized, and is thus 
considered wasted. Ineffective utilization of disk bandwidth 
adversely affects the number of streams a CM server can 
support at the same time. 
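
The figures quoted above can be checked with a little arithmetic. The following is a minimal sketch, assuming decimal units (1 GB = 1000 MB) and ignoring seek and rotational overhead, so the stream count is only an upper bound. The function names are illustrative and are not part of the disclosure.

```python
def storage_gb(minutes: float, rate_mbps: float) -> float:
    """Storage for one segment in GB (assumes 8 bits per byte, 1 GB = 1000 MB)."""
    megabits = minutes * 60 * rate_mbps
    return megabits / 8 / 1000


def max_streams(disk_mbps: float, rate_mbps: float) -> int:
    """Upper bound on concurrent streams fed by one disk (overhead ignored)."""
    return int(disk_mbps // rate_mbps)


# A 100-minute MPEG-1 segment at about 1.5 Mbps
print(storage_gb(100, 1.5))                         # 1.125
# Disks with 30 Mbps and 60 Mbps transfer rates
print(max_streams(30, 1.5), max_streams(60, 1.5))   # 20 40
```

Under these assumptions a single disk can feed at most a few tens of MPEG-1 streams, which is why the placement of popular segments matters.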
To utilize disk bandwidth more effectively, various
schemes have been devised where the work load is distributed
uniformly across multiple disks, i.e., media segments
are laid out on more than one disk. One popular method for
storing media segments across a plurality of disks is disk
striping, a well known technique in which consecutive
logical data units are distributed across a plurality of
individually accessible disks in a round-robin fashion. Disk
striping, in addition to distributing the work load uniformly
across disks, also enables multiple concurrent streams of a
media segment to be supported without having to replicate
the media segment. Disk striping has two general variations:
vertical striping and horizontal striping; these will be
explained in greater detail below.

Outstanding requests for media segments are generally
serviced by the CM server in the order in which they were
received, i.e., first-in first-out ("FIFO"). Where the number
of concurrent requests is less than or not much greater than
the number of concurrent streams that can be supported by
the server, good overall response times to all outstanding
requests are possible. In VOD environments, however, where the
number of concurrent requests typically far exceeds the
number of concurrent streams that can be supported by the
server, good overall response times are not possible for all
outstanding requests using FIFO. To provide better overall
response times, VOD environments, such as cable and
broadcasting companies, have adopted a paradigm known as
enhanced pay-per-view ("EPPV"). Using the enhanced
pay-per-view paradigm, CM servers retrieve and transmit media
segment streams to clients at fixed intervals or periods.
The average response time to fulfill a client's request is
half of the fixed interval, and the worst case response time
to fulfill a request is the fixed interval. For example, if a
media segment is to begin every 3 minutes, the average time
to fulfill a client's request is 1.5 minutes; the worst case
response time is 3 minutes.

Furthermore, by retrieving popular media segments more
frequently, and less popular media segments less frequently,
better overall average response times could be achieved.
Finally, clients can be informed about the periods and the
exact times at which media segments are offered, so that
predictable overall response times can be provided.
Although a set of media segments is schedulable on a CM
server employing the EPPV paradigm, determining an exact
schedule for periodic display of media segments can be
difficult, particularly when the display periods, media
segment lengths and transfer rates, i.e., the time required to
transmit a media segment, differ. The goal is to schedule
the set of media segments such that the number of streams
scheduled to be transmitted concurrently does not exceed the
maximum number of concurrent streams supportable by the
CM server. The complexity of scheduling media segments in
an EPPV paradigm increases dramatically as the number of
media segments being scheduled and the number of server
resources by which the media segments are transmitted
increase. Accordingly, there is a need for a method and
apparatus that can effectively schedule media segments
periodically on a CM server employing the EPPV paradigm.
More specifically, there is a need in the art for a method and
apparatus that can effectively schedule media segments of
different popularity and length.

SUMMARY OF THE INVENTION 

To address the above-discussed deficiencies of the prior
art, the present invention provides various systems and 
methods of scheduling media segments of varying display 
rate, length and/or retrieval period on at least one clustered,
vertically-striped or horizontally-striped CM database volume.
With respect to the at least one clustered database
volume or the at least one vertically-striped database
volume, one method includes the steps of: (1) associating a
display value and a normalized bandwidth consumption with
each of the media segments, (2) sorting the media segments
in a non-increasing order of value density (which may be,
but is not limited to, a ratio of the display value to the
normalized bandwidth consumption) to obtain an ordered
list thereof and (3) organizing the media segments into the
at least one database volume in a particular order. This
determined particular order advantageously increases the
total display value of the media segments, increasing the
ability of the database volume to provide media segments to
more clients based on the segments' popularity and within
bandwidth constraints.

With respect to the at least one horizontally-striped database
volume, one method includes the steps of: (1) associating
a display value with each of the media segments, (2)
sorting the media segments in a non-increasing order of
display value to obtain an ordered list thereof and (3)
building a scheduling tree of the media segments, the
scheduling tree having a particular structure. The particular
structure advantageously increases a total display value of
the media segments, increasing, as above, the overall
effectiveness of the database volume.

For purposes of the present invention, a "volume" is
defined as a logical storage unit. The "volume" may be all
or part of a single physical disk drive, a cluster of disk
drives, a stripe set or some other arrangement treated as a
logical storage unit.

The foregoing has outlined, rather broadly, embodiments
of the present invention so that those skilled in the art may
better understand the detailed description of the invention
that follows. Additional embodiments of the invention will
be described hereinafter that form the subject of the claims
of the invention. Those skilled in the art should appreciate
that they can readily use the disclosed conception and
embodiments as a basis for designing or modifying other
structures for carrying out the same purposes of the present
invention. Those skilled in the art should also realize that
such equivalent constructions do not depart from the spirit
and scope of the invention in its broadest form.

BRIEF DESCRIPTION OF THE DRAWINGS

For a more complete understanding of the present
invention, reference is now made to the following
descriptions taken in conjunction with the accompanying drawings,
in which:

FIG. 1 illustrates an EPPV system containing the
scheduling systems and methods of the present invention;

FIGS. 2A and 2B illustrate schematic diagrams of a
representative media segment matrix and a layout of the
representative segment matrix on a disk;

FIG. 3 illustrates a flow diagram of a method of
organizing media segments on a disk;

FIGS. 4A and 4B illustrate schematic diagrams of vertical
striping and horizontal striping;

FIGS. 5A, 5B, 6A and 6B illustrate generalized
scheduling tree structures for simple periodic tasks according to
the present invention and a particular scheduling tree
structure for an exemplary set of tasks;

FIG. 7 illustrates a flow diagram of a method of building
a scheduling tree;

FIGS. 8A through 8D illustrate an exemplary scheduling
tree being built according to the method illustrated in FIG. 7;

FIGS. 9A and 9B illustrate an exemplary scheduling tree
before and after a split therein;

FIG. 10 illustrates a flow diagram of a method of building
a scheduling tree having equidistant subtasks;

FIGS. 11 illustrate an exemplary scheduling tree with
equidistant subtasks being built according to the method
illustrated in FIG. 9;

FIGS. 12A and 12B illustrate graphical representations of
workloads under simulated conditions for the resource
scheduling systems and methods of the present invention;
and

FIGS. 13A and 13B illustrate further graphical
representations of workloads under simulated conditions for the
resource scheduling systems and methods of the present
invention.

DETAILED DESCRIPTION

Referring initially to FIG. 1, illustrated is an EPPV system
containing the scheduling systems and methods of the
present invention. The system, generally designated 100,
comprises a CMOD server 110 having at least one database
volume 120 associated therewith. Media segments (not
shown) are stored on and retrieved from the database
volume 120 by scheduling and control circuitry or software
130 that includes an associator 132, a sorter 134 and an
organizer 136 therein for associating values with media
segments, sorting the media segments according to methods
that will be set forth hereinafter and organizing the database
volume 120 or building one or more scheduling trees,
respectively and as appropriate.

The associator 132, sorter 134 and organizer 136 may be
embodied as a sequence of instructions executable within
general purpose data processing and storage circuitry (not
shown) within the CMOD server 110. In alternate
advantageous embodiments, the associator 132, sorter 134 and
organizer 136, in whole or in part, may be replaced by, or
combined with, any suitable processing configuration,
including programmable logic devices, such as programmable
array logic ("PALs") and programmable logic arrays
("PLAs"), digital signal processors ("DSPs"),
field-programmable gate arrays ("FPGAs"), application-specific
integrated circuits ("ASICs"), large scale integrated circuits
("LSIs"), very large scale integrated circuits ("VLSIs") or
the like, to form the various types of circuitry described and
claimed herein.

FIG. 1 further illustrates a plurality of media receivers 140
(such as personal computers or television sets) that are
coupled to the CMOD server 110. The plurality of media
receivers 140 receive selected ones of the media segments
from the CMOD server 110 and perform (show or play) the
segments for the benefit of a user. An intermediate
network, such as the Public Switched Telephone
Network ("PSTN") (not shown), may be interposed between
the CMOD server 110 and the plurality of media receivers
140 to assist in distributing the media segments.

Turning now to FIGS. 2A and 2B, illustrated are schematic
diagrams of a representative media segment matrix
and a layout of the representative matrix on a disk. The
EPPV service model associates with each segment C_i a
retrieval period T_i that is the reciprocal of its display
frequency. The retrieval periods of media segments are
multiples of the round length T, and data for streams are
retrieved from volumes into memory in rounds of length T.
Each media segment C_i also has a length l_i (in units of time)
and a per stream disk bandwidth requirement r_i (known as
display rate). The display frequency is determined as a
function or characteristic of the popularity of the respective
media segments at a given point in time or over a given
period of time. As one would expect, segment popularity
tends to change over time, as, for example, new movies are
introduced and older ones attract less attention.

The matrix-based allocation scheme illustrated in FIG. 2A
increases the number of clients that can be serviced under
the EPPV service model by laying out data based on the
knowledge of retrieval periods. The basic idea is to
distribute, for each segment C_i, the starting points for the
concurrent display phases (retrieval of the media segment
starting at a given rate) of C_i uniformly across its length.
Each such display phase corresponds to a different stream
servicing (possibly) multiple clients. Conceptually, each
segment C_i is viewed as a matrix 200 consisting of elements
of length T (in units of time) arranged in columns 210 and
rows 220. The numbers of columns 210 and rows 220 of the
matrix 200 depend upon the length l_i of the media segment
C_i and its retrieval period T_i. The number of columns 210 is

$$\min\left\{ \left\lceil \frac{l_i}{T} \right\rceil,\ \frac{T_i}{T} \right\}.$$


The first T units of time of the media segment correspond to
the matrix element in the first row 220 and first column 210,
the second T units of time of the media segment correspond
to the matrix element in the first row 220 and second column
210, and so on.

The matrix 200 is stored on the volume in column-major
form such that each column is stored contiguously on the
volume. Furthermore, the retrieval of a media segment is
performed on columns (i.e., one column per round) with
each column element provided to a different display phase.
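
The matrix view and its column-major layout can be sketched as follows. This is a minimal illustration, not the disclosed implementation: it assumes that all times are integers in the same unit, that the number of columns is min(ceil(l_i/T), T_i/T), and that the number of rows is whatever is needed to hold ceil(l_i/T) elements. The function names are hypothetical.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling of a / b for positive integers."""
    return -(-a // b)


def matrix_shape(length: int, period: int, round_len: int):
    """Return (rows, cols) of the segment matrix (row count is an assumption)."""
    elements = ceil_div(length, round_len)          # elements of length T
    cols = min(elements, period // round_len)
    rows = ceil_div(elements, cols)
    return rows, cols


def column_major(length: int, period: int, round_len: int):
    """Element indices grouped by column, i.e. in on-disk order.

    Element k (the k-th T units of the segment) sits at row k // cols,
    column k % cols, so each column holds elements one period apart.
    """
    elements = ceil_div(length, round_len)
    rows, cols = matrix_shape(length, period, round_len)
    return [[r * cols + c for r in range(rows) if r * cols + c < elements]
            for c in range(cols)]


# Segment of 10 rounds, restarted every 4 rounds, round length 1
print(matrix_shape(10, 4, 1))    # (3, 4)
print(column_major(10, 4, 1))    # [[0, 4, 8], [1, 5, 9], [2, 6], [3, 7]]
```

Each inner list is one column, read in a single round; its entries are one retrieval period apart and therefore feed different display phases.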

In a clustered CMOD server, each disk is viewed as an
independent unit. Entire media segments are stored on, and
retrieved from, a single disk; multiple segments can be
clustered on a single disk. Turning now to FIG. 3, illustrated
is a flow diagram of a method of organizing media segments
on the disks of a clustered CMOD server according to the
present invention. Each media segment C_i is assigned a
value, value(C_i).

Furthermore, each media segment has a two dimensional size
vector as described below. Each media segment is associated
with a value density p_i. The value density p_i for media
segment C_i is defined as the ratio of the value of C_i to the
maximum component of the size vector.

The method, generally designated 300, begins in a sorting
step 310 wherein segments in C are sorted in non-increasing
order of value density to obtain a list L = <C_1, . . . , C_N> where
p_i (the value density of C_i) is greater than or equal to p_{i+1}.
Next, in a step 320, load(B_j) and value(B_j) are initialized
to zero. Further, B_j is initialized to an empty set for each bin
(i.e., volume or disk) B_j, j = 1, . . . , N.

Next, in steps 330 and 340, an iterative process is
undertaken wherein B_j is designated as the first bin (i.e., volume)
such that load(B_j) plus size(C_i) is less than or equal to 1.
"Size(C_i)" is a two dimensional vector having a first
component defined in terms of the normalized contribution
of C_i to the length of a round or, equivalently, C_i's
normalized bandwidth consumption and a second component
defined in terms of C_i's normalized storage capacity.

Next, load(B_j) is made equal to load(B_j) + size(C_i),
value(B_j) is made equal to value(B_j) plus value(C_i)
(defined in terms of the bandwidth C_i effectively utilizes
during a round), B_j is made equal to B_j ∪ {C_i} and L is made
equal to L − {C_i}. Finally, in a step 350, B_<i>, i = 1, . . . , n_disk,
is made to represent the bins corresponding to the n_disk
largest values in the final organizing.

The method may be embodied as a procedure termed 
"PackSegments" set forth in Table I below: 

TABLE I

"PACKSEGMENTS"

Input: A collection of CM segments C = {C_1, . . . , C_N} and a number
of disks n_disk.

Output: C' ⊆ C and a packing of C' in unit capacity bins.
(Goal: Maximize Σ_{C_i ∈ C'} value(C_i).)

1. Sort the segments in C in non-increasing order of value density to
obtain a list L = <C_1, . . . , C_N> where p_i ≥ p_{i+1}. Initialize
load(B_j) = value(B_j) = 0, B_j = ∅ for each bin
(i.e., disk) B_j, j = 1, . . . , N.

2. For each segment C_i in L (in that order):

2.1 Let B_j be the first bin (i.e., disk) such that load(B_j) +
size(C_i) ≤ 1.

2.2 Set load(B_j) = load(B_j) + size(C_i), value(B_j) = value(B_j) +
value(C_i), B_j = B_j ∪ {C_i}, and L = L − {C_i}.

3. Let B_<i>, i = 1, . . . , n_disk, be the bins corresponding to the n_disk
largest values in the final packing. Return C' = ∪_{i=1}^{n_disk} B_<i>. (The
packing of C' is defined by the B_<i>'s.)
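
The steps of Table I can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the size vector is compared to the unit capacity component by component, a segment that fits in no bin is simply skipped, and the names `Segment`, `Bin` and `pack_segments` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    name: str
    value: float
    size: tuple  # normalized (bandwidth, storage), each component in (0, 1]

    @property
    def density(self) -> float:
        # Value density: value over the maximum component of the size vector
        return self.value / max(self.size)


@dataclass
class Bin:
    load: list
    value: float = 0.0
    segments: list = field(default_factory=list)


def pack_segments(segments, n_disk):
    """First-fit packing in non-increasing order of value density."""
    if not segments:
        return []
    dims = len(segments[0].size)
    order = sorted(segments, key=lambda s: s.density, reverse=True)
    bins = [Bin(load=[0.0] * dims) for _ in order]   # N bins, as in step 1
    for seg in order:
        for b in bins:
            # Assumption: load + size <= 1 must hold in every dimension
            if all(l + s <= 1.0 for l, s in zip(b.load, seg.size)):
                b.load = [l + s for l, s in zip(b.load, seg.size)]
                b.value += seg.value
                b.segments.append(seg)
                break
    # Step 3: keep the n_disk bins with the largest accumulated value
    best = sorted(bins, key=lambda b: b.value, reverse=True)[:n_disk]
    return [b for b in best if b.segments]


segs = [Segment("A", 6, (0.5, 0.25)), Segment("B", 3, (0.5, 0.5)),
        Segment("C", 3, (0.75, 0.25)), Segment("D", 1, (0.25, 0.25))]
for b in pack_segments(segs, n_disk=1):
    print([s.name for s in b.segments], b.value)   # ['A', 'B'] 9.0
```

With a single disk, segments A and B fill the first bin and are kept, while C and D land in a second bin that is discarded in step 3.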



Turning now to FIGS. 4A and 4B, illustrated are schematic
diagrams of vertical striping and horizontal striping.
In vertical striping (FIG. 4A), each column 410, 420, 430 of
a given segment matrix is declustered across all disks 400A,
400B, 400C of a given CMOD server. This scheme is similar
to fine-grained striping or RAID-3 data organization, since
each column of each segment has to be retrieved in parallel
from all disks (as a unit). "PackSegments" is able to
operate with vertical striping. In this case, the size vector for
each media segment is one-dimensional and consists of the
normalized bandwidth requirement (or consumption) for the
media segment.

In horizontal striping (FIG. 4B), the columns 450, 460,
470, 480, 490 of a given segment matrix are mapped to
individual disks 440A, 440B, 440C in a round-robin manner.
Consequently, the retrieval of data for a transmission of C_i
proceeds in a round-robin fashion along the disk array.
During each round, a single disk is used to read a column of
C_i and consecutive rounds employ consecutive disks.
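
The round-robin mapping used by horizontal striping can be illustrated with a short sketch. It assumes zero-based column and disk indices and that the first column is placed on disk 0; both conventions and the function names are illustrative only.

```python
def disk_for_column(column: int, n_disk: int, first_disk: int = 0) -> int:
    """Disk that stores a given column under round-robin placement."""
    return (first_disk + column) % n_disk


def horizontal_layout(n_columns: int, n_disk: int) -> dict:
    """Map each disk index to the list of columns stored on it."""
    layout = {d: [] for d in range(n_disk)}
    for c in range(n_columns):
        layout[disk_for_column(c, n_disk)].append(c)
    return layout


# Five columns over three disks, as in the arrangement of FIG. 4B
print(horizontal_layout(5, 3))   # {0: [0, 3], 1: [1, 4], 2: [2]}
```

Because column k is read in the k-th round of a transmission, consecutive rounds touch consecutive disks.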

Consider the periodic retrieval of C_i from a specific disk.
By virtue of the round-robin placement, during each
transmission of C_i, a column of C_i must be retrieved from that
disk periodically at intervals of n_disk rounds. Furthermore, to
support EPPV service, the transmissions of C_i are themselves
periodic, with a period T_i = n_i · T. Thus, the retrieval of
C_i from a specific disk is a collection of periodic real time
tasks with period T_i (i.e., the media segment's transmission),
where each task consists of a collection of subtasks that are
n_disk · T time units apart (i.e., column retrievals within a
transmission). A simplified version of this problem occurs
when, for each media segment C_i, l_i ≤ n_disk · T holds. In this
case, periodic retrieval of a media segment consists of a
simple periodic task.
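
The task structure seen by one disk can be made concrete with a small sketch. It assumes zero-based rounds, that column k of a transmission is read in round start + k, and that the simplified case corresponds to a segment with at most n_disk columns (our reading of the condition l_i ≤ n_disk · T). The function names are hypothetical.

```python
def retrieval_rounds(disk: int, n_columns: int, n_disk: int,
                     start_round: int = 0, first_disk: int = 0) -> list:
    """Rounds in which `disk` is read during one transmission of a segment."""
    return [start_round + c for c in range(n_columns)
            if (first_disk + c) % n_disk == disk]


def is_simple_periodic(n_columns: int, n_disk: int) -> bool:
    """True when each disk is read at most once per transmission."""
    return n_columns <= n_disk


# Seven columns over three disks: disk 0 is read every n_disk = 3 rounds
print(retrieval_rounds(0, 7, 3))     # [0, 3, 6]
print(is_simple_periodic(7, 3))      # False
print(is_simple_periodic(3, 3))      # True
```

Each returned round is one subtask; the whole list repeats with the transmission period n_i.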

Turning now to FIGS. 5A and 5B, illustrated are a
generalized scheduling tree structure 500 for simple periodic
tasks, where this task model is applicable to media segments
for which l_i ≤ n_disk · T holds, according to the present
invention and a particular scheduling tree structure for an
exemplary set of tasks for which l_i ≤ n_disk · T (containing nodes 540,
550, 560, 570, 580, 590 and edges 542, 544, 552, 554, 562,
582). FIGS. 5A and 5B are presented primarily for the
purpose of providing an overview of the scheduling tree
structure concept of the present invention. A scheduling tree
(of the present invention and as described below) determines
a "conflict-free" schedule for the periodic retrieval of media
segments that are part of the scheduling tree. That is, the
retrieval of these media segments will not collide in a round.

One fundamental concept of the present invention is that
all tasks in a subtree rooted at some edge 512, 522 emanating
from node n (such as a node 510) at level l use time slot
numbers that are congruent to i (mod π_l(n)), where i is a
unique number between 0 and π_l(n) − 1. Satisfying this
invariant recursively at every internal node 520, 530 ensures
the avoidance of conflicts.

An internal node n at level l is candidate for period n_i
(n_i = T_i/T) if and only if π_{l−1}(n) | n_i and gcd . . . .

A period n_i can be scheduled under any candidate node n in
a scheduling tree. Two possible cases exist:

If π_l(n) | n_i then the condition above guarantees that n (in
a tree having a node 600 and edges 610a, 610b, 610c)
has at least one free edge 610d at which n_i can be
placed.

If π_l(n) ∤ n_i then, to accommodate n_i under node n (in a
tree having a node 600 and edges 620a, 620b, 620c,
620d, 630a, 630b, 630c), n must be split so that the
defining properties of the scheduling tree structure are
kept intact. This may be done as follows. Let

$$d = \gcd\left( w(n),\ \frac{n_i}{\pi_{l-1}(n)} \right).$$

Node n is split into a parent node with weight d and child
nodes with weight w(n)/d, with the original children of n
divided among the new child nodes; that is, the first batch of
w(n)/d children of n are placed under the first child node, and
so on. It is apparent that this splitting maintains the properties
of the structure. Furthermore, the condition set forth above
guarantees that the newly created parent node will have at
least one free edge for scheduling n_i.

The set of candidate nodes for each period to be scheduled
can be maintained efficiently, in an incremental manner. The
observation here is that when a new period n_i is scheduled,
all remaining periods advantageously only have to check a
maximum of three nodes, namely the two closest ancestors
of the leaf for n_i and, if a split occurred, the last child node
created in the split, for possible inclusion or exclusion from
their candidate sets.

As above, it is assumed that each task is associated with a
value and that improving the cumulative value of a schedule
is the objective. The basic idea of the heuristic of one aspect
of the present invention (termed BuildTree) is to build the
scheduling tree incrementally in a greedy fashion, scanning
the tasks in non-increasing order of value and placing each
period n_i in that candidate node M that implies the minimum
value loss among all possible candidates. This loss is
calculated as the total value of all periods whose candidate sets
become empty after the placement of n_i under M. Ties are
always broken in favor of those candidate nodes that are
located at higher levels (i.e., closer to the leaves), while ties
at the same level are broken using the postorder node
numbers (i.e., left-to-right order). When a period n_i is
scheduled in Γ, the candidate node sets for all remaining periods
are updated (in an incremental fashion) and the method
continues with the next task/period (with at least one
candidate in Γ).

FIG. 6 illustrates a flow diagram of a method 600 of
building a scheduling tree for a limited segment model
wherein l_i ≤ n_disk · T. The method 600 may be embodied as a
procedure termed "BuildTree" set forth in Table II below:

TABLE II

"BUILDTREE"

Input: A set of simple periodic tasks C = {C_1, . . . , C_N} (l_i ≤
n_disk · T) with corresponding periods P = {n_1, . . . , n_N}, and a
value() function assigning a value to each C_i.

Output: A scheduling tree Γ for a subset C' of C. (Goal: Maximize
Σ_{C_i ∈ C'} value(C_i).)

1. Sort the tasks in C in non-increasing order of value to obtain
a list L = <C_1, C_2, . . . , C_N>, where value(C_i) ≥ value(C_{i+1}).
Initially, Γ consists of a root node with a weight equal to n_1.

2. For each periodic task C_i in L (in that order):

2.1 Let cand(n_i, Γ) be the set of candidate nodes for n_i in Γ.
(Note that this set is maintained incrementally as the tree is built.)

2.2 For each n ∈ cand(n_i, Γ), let Γ ∪ {C_i}_n denote the tree that
results when n_i is placed under node n in Γ. Let loss(n) =
{C_j ∈ L − {C_i} | cand(n_j, Γ ∪ {C_i}_n) = ∅}
and value(loss(n)) = Σ_{C_j ∈ loss(n)} value(C_j).

2.3 Place n_i under the candidate node M such that
value(loss(M)) = min_{n ∈ cand(n_i, Γ)} {value(loss(n))}. (Ties are broken in
favor of nodes at higher levels.) If necessary, node M is split.

2.4 Set Γ = Γ ∪ {C_i}_M, L = L − loss(M).

2.5 For each task C_j ∈ L, update the candidate node set
cand(n_j, Γ).

With reference to FIG. 6, the method begins in a step 610
wherein media segments (tasks) are sorted in a
non-increasing order of value to obtain a list L = <C_1, . . . ,
C_N>. Next, for each periodic task in order, a candidate set of
nodes is developed (in a step 620), a tree is built iteratively
(in a step 630), where n_i is placed under a selected candidate
node (in a step 640) and candidate nodes are updated for
remaining periods (in a step 650).

Let N be the number of tasks in C. The number of internal
nodes in a scheduling tree is always going to be O(N). To see
this, note that an internal node will always have at least two
children, with the only possible exception being the
rightmost one or two new nodes created during the insertion of
a new period. Since the number of insertions is at most N,
it follows that the number of internal nodes is O(N). Based
on this fact, it is easy to show that BuildTree runs in time
O(N^3).

Example 2: Consider the list of periods <n_1 = 2, n_2 = 12,
n_3 = 30> (sorted in non-increasing order of value). Turning
now to FIGS. 7A through 7D, illustrated is the step-by-step
construction of the scheduling tree (comprising nodes 700,
710, 720, 710a, 710b, 730a, 730b) using BuildTree. Note
that period n_3 splits the node with weight 6 into two nodes
with weights 3 and 2 (the node 720 splits into nodes 720a,
720b).
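
The split in Example 2 can be reproduced with a few lines of Python. This is a minimal sketch, assuming that a split uses d = gcd(w(n), n_i/π_{l−1}(n)), with parent weight d and child weight w(n)/d. The slot numbers in the conflict check (0, 1 and 3) are one assignment we derived for the periods 2, 12 and 30; they are not taken from the figures, and the function names are hypothetical.

```python
from math import gcd


def split_weights(node_weight: int, period: int, ancestors_product: int):
    """Weights (parent, child) produced when a node is split for `period`.

    `ancestors_product` is the product of the weights of the nodes above
    the node being split (an assumption about the meaning of pi_{l-1}(n)).
    """
    if period % ancestors_product != 0:
        raise ValueError("ancestors' weight product must divide the period")
    d = gcd(node_weight, period // ancestors_product)
    return d, node_weight // d


def conflict_free(assignments) -> bool:
    """True if no two (slot, period) tasks ever share a round."""
    horizon = 1
    for _, period in assignments:
        horizon = horizon * period // gcd(horizon, period)   # running lcm
    used = set()
    for slot, period in assignments:
        for t in range(slot, horizon, period):
            if t in used:
                return False
            used.add(t)
    return True


# Node of weight 6 under a root of weight 2, new period 30
print(split_weights(6, 30, 2))                          # (3, 2)
# Slots 0 (mod 2), 1 (mod 12) and 3 (mod 30) never collide
print(conflict_free([(0, 2), (1, 12), (3, 30)]))        # True
```

The result (3, 2) matches the weights stated in Example 2, and the brute-force check over one hyperperiod confirms that the chosen slots are conflict-free.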
In the general case, when the lengths of the media
segments are not restricted, periodic media segment retrieval
under horizontal striping was defined above as a periodic
real-time task C_i with period

$$n_i = \frac{T_i}{T}$$

(in rounds) that consists of a collection of

$$\left\lceil \frac{c_i}{n_{disk}} \right\rceil$$

subtasks (c_i being the number of columns in the matrix for
media segment C_i) that need to be scheduled n_disk rounds
apart. The basic observation here is that all the subtasks of
C_i are themselves periodic with period n_i, so the techniques
of the previous section can be used for each individual
subtask. However, the scheduling method also needs to
ensure that all the subtasks are scheduled together, using
time slots (i.e., rounds) placed regularly at intervals of n_disk.
Heuristic methods for building a scheduling tree in this
generalized setting will now be set forth in detail.

An important requirement of this more general task model
is that the insertion of new periods cannot be allowed to
distort the relative placement of subtasks already in the tree.
The splitting mechanism described in the previous section
for simple periodic tasks does not satisfy this requirement,
since it can alter the starting time slots for all subtasks
located under the split node. Instead, the present invention
employs a different method for "batching" the children of
the node being split, so that the starting time slots for all leaf
nodes remain unchanged. This new splitting rule is as
follows: if the node n is split to give a new parent node with
weight d, then place at edge k of the new node (k = 0, . . . , d−1)
all the children of the old node n whose parent edge weight
was congruent to k (mod d).

Turning now to FIGS. 8A and 8B, illustrated are an
exemplary scheduling tree (having nodes 800, 810a, 810b)
before and after a split therein using the above-described
splitting rule of the present invention (and adding a node
810c). Example 3: FIG. 8A illustrates a scheduling tree with
two tasks with periods n_1 = 6, n_2 = 6 assigned to slots 0 and 1.
FIG. 8B depicts the scheduling tree after a third task with
period n_3 = 15 is inserted. Although there is enough capacity
for both n_1 and n_2 in the subtree connected to the root with
edge 0, the splitting rule of the present invention forces n_2
to be placed in the subtree connected to the root with edge
1.

In this setting, the notion of a candidate node is defined as
follows: an internal node n at level l is candidate for period
n_i if and only if π_{l−1}(n) | n_i and there exists a k ∈ {0, . . . , d−1}
such that all edges of n with weights congruent to k (mod d)
are free, where

$$d = \gcd\left( w(n),\ \frac{n_i}{\pi_{l-1}(n)} \right).$$

However, under the generalized model of periodic tasks
of the present invention, a candidate node for n_i can only
accommodate a subtask of C_i. This is clearly not sufficient
for the entire task. The temporal dependency among the

the equation:

$$u_i = \mathrm{ancestor\_edge}_1(n_i) + \sum_{j=2}^{l} \mathrm{ancestor\_edge}_j(n_i) \cdot \pi_{j-2}(n_i),$$

for checking the availability of specific time slots in the
scheduling tree. The scheduling of C_i can then be handled as
follows. Select a candidate node for n_i and a time slot u_i for
n_i under this candidate. Place the first subtask of C_i in u_i and
call the predicate repeatedly to check if n_i can be scheduled
in slot u_i + j · n_disk for each remaining subtask j.
If the predicate succeeds for all j then C_i is scheduled
starting at u_i. Otherwise, the method can try another
potential starting slot u_i.

A problem with the approach outlined above is that even if
the number of starting slots tried for C_i is restricted to a
constant, scheduling each subtask individually yields
pseudo-polynomial time complexity. This is because the
number of scheduling operations in a trial will be

$$O\left( \frac{c_i}{n_{disk}} \right),$$

where

$$c_i = \min\left\{ \left\lceil \frac{l_i}{T} \right\rceil,\ \frac{T_i}{T} \right\}$$

is part of the problem input.

The present invention provides a polynomial time
heuristic method for the problem. To simplify the presentation,
it is assumed that every period n_i is a multiple of n_disk.
Although it is possible to extend the heuristic described
herein to handle general periods, it is believed that this
assumption is not very restrictive in practice. This is because
round lengths T are typically expected to be in the area of a
few seconds and periods T_i are typically multiples of some
number of minutes (e.g., 5, 10, 30 or 60 minutes). Therefore,
it is realistic to assume the smallest period in the system can
be selected to be a multiple of n_disk. The objective is to
devise a method that ensures that if the first subtask of a task
C_i does not collide with the first subtask of any other task in
the tree, then no other combination of subtasks can cause a
collision to occur. This means that once the first subtask of
C_i is placed in the scheduling tree there is no need to check
the rest of C_i's subtasks individually.

The method of the present invention sets the weight of the
root of the scheduling tree to n_disk. (This is possible since the
n_i's are multiples of n_disk.) This implies that consecutive
subtasks of a task will require consecutive edges emanating
from nodes at the first level (i.e., direct descendants of the root),
which are the first-level ancestors of the leaf nodes where the

subtasks of C, means that the scheduling tree method of the subtasks are placed. When the first subtask of a task is placed 

present invention should make sure that all the subtasks of at a leaf node, at least some of the consecutive edges of the 

C,. are placed in the tree at distances of n^. 65 first-level ancestor node of that leaf are disabled, so that the 

One way to deal with this situation is to maintain candi- slots under those edges cannot be used by the first subtask 

date nodes for subtasks and use a simple predicate based on of any future task. By the previous observation, 
s_i - 1 consecutive edges of the first-level ancestor of the leaf for n_i
must be disabled, starting with the right neighbor of the edge
under which that leaf resides. (s_i is the number of subtasks
of C_i.) This "edge disabling" is implemented by maintaining
an integer distance for each edge e emanating from a
first-level node that is equal to the number of consecutive
neighbors of edge e that have been disabled. The placement
method of the present invention should maintain two invariants.
First, the distance of an edge e of a first-level node is
always equal to max{s_i} - 1, where the max is taken over
all tasks C_i placed under e in the tree. Second, the sum of the
weight of an edge e of a first-level node n and its distance
is always less than the weight of n (so that the defining
properties of the tree are maintained). Based on the above
method, the notion of a candidate node can be defined as
follows: let n be an internal node at level l. Let n_i be a period
and define

d = gcd(w(n), n_i / π_{l-1}(n)).

Node n is candidate for period n_i if and only if π_{l-1}(n) | n_i and
the following conditions hold:

1. If n is the root node, n has a free edge.

2. If level(n)=1, there exists an I ∈ {0, . . . , d-1} such that
all (non-disabled) edges of n whose sum of weight plus
distance is congruent to (I+j) (mod d), for 0 ≤ j < s_i, are
free.

3. If level(n) ≥ 2,

3.1 there exists an I ∈ {0, . . . , d-1} such that all edges
of n with weight congruent to I (mod d) are free; and,

3.2 s_i - 1 + ancestor_edge_1(n) < w(ancestor_node_1(n)) and
s_i + ancestor_edge_1(n) is less than or equal to the
weight of the (non-disabled) edge following
ancestor_edge_1(n), if there is such an edge.

Note that clause 2 ensures that edge distances are maintained 
when the first-level nodes are split. 
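
By way of illustration only, the edge-distance bookkeeping described above may be sketched in Python as follows. The class FirstLevelNode and its method names are hypothetical and do not appear in the source. The sketch assumes that the edges of a first-level node are numbered 0 through weight-1 and that a task with s subtasks starting under an edge disables the s-1 edges to its right. It covers only the distance invariants and the "next non-disabled edge" test, not the full candidate-node definition.

```python
class FirstLevelNode:
    """Distance bookkeeping for one first-level node (hypothetical helper)."""

    def __init__(self, weight):
        self.weight = weight   # number of outgoing edges, numbered 0 .. weight-1
        self.distance = {}     # start edge -> number of disabled right neighbours

    def is_disabled(self, edge):
        # An edge is disabled if it lies within the distance of an earlier start edge.
        return any(e < edge <= e + dist for e, dist in self.distance.items())

    def can_start(self, edge, s):
        """True if a task with s subtasks may place its first subtask under `edge`."""
        if self.is_disabled(edge):
            return False
        reach = max(self.distance.get(edge, 0), s - 1)
        if edge + reach >= self.weight:      # second invariant would be violated
            return False
        later = [e for e in self.distance if e > edge]
        # The s consecutive edges must end before the next edge already in use.
        return not later or edge + reach < min(later)

    def place(self, edge, s):
        if not self.can_start(edge, s):
            raise ValueError("edge cannot host the first subtask of this task")
        # First invariant: distance equals max(s_i) - 1 over tasks under the edge.
        self.distance[edge] = max(self.distance.get(edge, 0), s - 1)


node = FirstLevelNode(weight=5)
node.place(edge=0, s=3)            # disables edges 1 and 2
print(node.distance)               # {0: 2}
print(node.can_start(1, 1))        # False: edge 1 is disabled
print(node.can_start(3, 2))        # True: edges 3 and 4 are available
```

A dictionary keyed by start edge is used here so that both invariants can be checked directly; an actual embodiment may store the distance on the edge itself.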

Turning now to FIG. 9, illustrated is a flow diagram of a 
method 900 of building a scheduling tree for periodic tasks 
having equidistant subtasks. The method 900 may be 
embodied as a procedure termed "BuildEquidTree" set 
forth in Table III below: 

TABLE III

"BUILDEQUIDTREE"

Input: A set of periodic tasks C = {C_1, . . . , C_N} with
corresponding periods P = {n_1, . . . , n_N} and a value() function
assigning a value to each C_i. Each task consists of subtasks
placed at intervals of n_disk.

Output: A scheduling tree Γ for a subset C' of C. (Goal: Maximize
the sum of value(C_i) over all C_i in C'.)

1. Sort the tasks in C in non-increasing order of value to obtain
a list L = <C_1, C_2, . . . >, where value(C_i) ≥ value(C_{i+1}).
Initially, Γ consists of a root node with a weight equal to n_disk.

2. For each task C_i in L (in that order):

2.1 Select a candidate node n for n_i in Γ. (Ties are broken
in favor of nodes at higher levels.)

2.2 If w(n) ∤ n_i, split n.

2.3 Schedule the first subtask of C_i under n. (Ties are
broken in favor of edges with smaller weights.)

2.4 Let d be the distance of the ancestor edge at the first
level of the leaf corresponding to n_i. Set the distance of
this edge to max{d, s_i - 1}.
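
By way of illustration only, the splitting rule invoked in step 2.2 (children whose parent edge weight is congruent to I (mod d) are batched under edge I of the new parent) may be sketched in Python as follows. The function name split_children and the nested-dictionary representation are hypothetical. The source states only which new edge receives each child; placing the child at inner edge e // d is an assumption made here so that the original edge e can be recovered as I + d·(e // d).

```python
def split_children(children, weight, d):
    """Re-batch the children of a node of the given weight under a new parent of weight d.

    children: dict mapping old edge number -> subtree (any object).
    Returns:  dict mapping new edge I -> dict of inner edge -> subtree.
    """
    if weight % d != 0:
        raise ValueError("d must divide the weight of the node being split")
    batches = {i: {} for i in range(d)}
    for old_edge, subtree in children.items():
        # Children with old_edge congruent to I (mod d) go under new edge I.
        batches[old_edge % d][old_edge // d] = subtree
    return batches


# Example 3 of the text: two tasks of period 6 at slots 0 and 1 under a root of
# weight 6; inserting a task of period 15 gives d = gcd(6, 15) = 3.
print(split_children({0: "task 1", 1: "task 2"}, weight=6, d=3))
# {0: {0: 'task 1'}, 1: {0: 'task 2'}, 2: {}}
```

As in Example 3, the second task ends up under edge 1 of the new root even though edge 0 still has capacity, and new edge 2 is left free for the inserted task.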



BuildEquidTree can be used to construct a scheduling tree
in polynomial time. With reference to FIG. 9, the method
900 begins in a step 910 wherein the tasks are sorted in a
non-increasing order of value to obtain a list L=<C_1, C_2, . . . ,
C_N>. Next, for each periodic task in order, a candidate node
n is selected (in a step 920), n is split if w(n) ∤ n_i (in a step
930), the first subtask of the task is scheduled under n (in a
step 940) and edge distances are set (in a step 950).

Turning now to FIGS. 10A through 10C, illustrated are an
exemplary scheduling tree (variations of which are designated
1000, 1010, 1020) with equidistant subtasks being
built according to the method illustrated in FIG. 9.
Example 4: Consider three tasks C_1, C_2, C_3 with s_1, s_2, s_3 = 2,
1, 3 and n_1, n_2, n_3 = 12, 18, 10 and n_disk=2. FIGS. 10A through
10C illustrate the three states of the scheduling tree after
placement of C_1, C_2 and C_3, respectively.
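
By way of illustration only, the following Python sketch checks by brute force whether a set of tasks with equidistant subtasks collides in any time slot, using the parameters of Example 4. The function names are hypothetical, and the starting slots used below (0, 4 and 1) are assumed for the purpose of the sketch; they are not read from FIGS. 10A through 10C. Subtask j of a task with starting slot u and period n is taken to occupy slots u + j·n_disk + m·n for m = 0, 1, 2, . . . , and one full cycle of length lcm(n_1, . . . , n_N) is examined.

```python
from functools import reduce
from math import gcd


def lcm(a, b):
    return a * b // gcd(a, b)


def occupied_slots(start, period, n_subtasks, n_disk, horizon):
    """Slots (modulo horizon) used by a task whose subtasks are n_disk apart."""
    slots = set()
    for j in range(n_subtasks):
        first = start + j * n_disk
        for m in range(horizon // period):
            slots.add((first + m * period) % horizon)
    return slots


def first_collision(tasks, n_disk):
    """Return (slot, earlier_task, later_task) for the first clash found, else None.

    tasks: list of (name, start_slot, period, n_subtasks) tuples.
    """
    horizon = reduce(lcm, (period for _, _, period, _ in tasks))
    owner = {}
    for name, start, period, n_subtasks in tasks:
        for slot in sorted(occupied_slots(start, period, n_subtasks, n_disk, horizon)):
            if slot in owner:
                return slot, owner[slot], name
            owner[slot] = name
    return None


# Example 4 parameters with assumed starting slots.
print(first_collision([("C1", 0, 12, 2), ("C2", 4, 18, 1), ("C3", 1, 10, 3)], n_disk=2))
# None
print(first_collision([("C1", 0, 12, 2), ("C2", 1, 18, 1), ("C3", 1, 10, 3)], n_disk=2))
# (1, 'C2', 'C3')
```

The exhaustive check is pseudo-polynomial and is shown only as a way of validating a placement; the scheduling tree avoids it by checking the first subtask alone.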

An interesting property of the scheduling tree formulation
is that it can easily be extended to handle time slots that can
fit more than one subtask (i.e., can allow for some tasks to
collide). As set forth above, this is exactly the case for the
rounds of EPPV retrieval under horizontal striping. The
subtasks of C_i can be thought of as items of size(C_i) ≤ 1 (i.e.,
the fraction of disk bandwidth required for retrieving one
column of media segment C_i) that are placed in unit capacity
time slots. In this more general setting, a time slot can
accommodate multiple tasks as long as their total size does
not exceed one.

The problem can be visualized as a collection of unit
capacity bins (i.e., time slots) located at the leaves of a
scheduling tree, whose structure determines the eligible bins
for each task's subtasks (based on their period). With respect
to the previous model of tasks, the main difference is that
since slots can now accommodate multiple retrievals it is
possible for a leaf node that is already occupied to be a
candidate for a period. Hence, the basic idea for extending
the methods of the present invention to this case is to keep
track of the available slot space at each leaf node and allow
leaf nodes to be shared by tasks. Thus, the notion of
candidate nodes can simply be extended as follows: let n be
a leaf node of a scheduling tree Γ corresponding to period p.
Also let S(n) denote the collection of tasks (with period p)
mapped to n. The load of leaf n is defined as:

load(n) = Σ_{C_i ∈ S(n)} size(C_i).

A node n at level l is candidate for a task C_i (with period
n_i) if and only if:

1. n is internal and the conditions in the previous definition of
candidate node hold, or

2. n is external (leaf node) corresponding to n_i (i.e.,
π_{l-1}(n)=n_i), and load(n)+size(C_i) ≤ 1.
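
By way of illustration only, the load of a leaf and the leaf-node clause of the extended candidate definition may be sketched in Python as follows. The function names, the task names and the sizes are hypothetical; exact fractions are used so that a load of exactly one is not rejected because of rounding.

```python
from fractions import Fraction


def load(tasks_at_leaf, size):
    """load(n): total size of the tasks already mapped to leaf n."""
    return sum((size[name] for name in tasks_at_leaf), Fraction(0))


def leaf_is_candidate(leaf_period, tasks_at_leaf, size, task, task_period):
    """Clause 2: the leaf corresponds to the task's period and has room for it."""
    return (leaf_period == task_period
            and load(tasks_at_leaf, size) + size[task] <= 1)


# Hypothetical sizes: fraction of disk bandwidth needed per column retrieval.
size = {"A": Fraction(1, 2), "B": Fraction(1, 3),
        "C": Fraction(1, 4), "D": Fraction(1, 6)}
shared_leaf = ["A", "B"]                                   # load is 5/6

print(leaf_is_candidate(12, shared_leaf, size, "D", 12))   # True: 5/6 + 1/6 = 1
print(leaf_is_candidate(12, shared_leaf, size, "C", 12))   # False: 5/6 + 1/4 > 1
```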

With these extensions, it is easy to see that the BuildEquidTree
method can be used without modification to
produce a scheduling tree for the multi-task capacity case.

To construct forests of multiple non-colliding scheduling
trees, trees already built can be used to restrict task placement
in the tree under construction. By the Generalized
Chinese Remainder Theorem, the scheduling method needs
to ensure that each subtask of task C_i is assigned a slot u_i
such that u_i ≢ u_j (mod gcd(n_i, n_j)) for any subtask of any task
C_j that is scheduled in slot u_j in a previous tree within the
same forest. A general packing-based method set forth
below can be used for combining independently built scheduling
forests. Of course, a forest can always consist of a
single tree. The objective is to improve the utilization of
scheduling slots that can accommodate multiple tasks.

Given a collection of tasks, scheduling forests are constructed
until each task is assigned a time slot. No pair of
tasks within a forest will collide at any slot except for tasks
with the same period that are assigned to the same leaf node
as described in Section 5.3. A simple conservative approach
is to assume a worst-case collision across forests. That is, the
size of a forest F_i is defined as size(F_i) = max_{n_j ∈ F_i} {load(n_j)},
where n_j is any leaf node in F_i, and the load of a leaf node
is as given above. Further, a forest F_i has a value:
value(F_i) = Σ_{C_j ∈ F_i} value(C_j). Thus, under the assumption of a
worst-case collision, the problem of maximizing the total
scheduled value for a collection of forests is a traditional 0/1
knapsack optimization problem. A packing-based heuristic
such as PackSegments can be used to provide an approximate
solution.

In some cases, the worst-case collision assumption across
forests may be unnecessarily restrictive. For example, consider
two scheduling trees Γ_1 and Γ_2 that are constructed
independently. Let e_1 be an edge emanating from the root
node n_1 of Γ_1 and e_2 be an edge emanating from the root
node n_2 of Γ_2. If e_1 mod (gcd(n_1, n_2)) ≠ e_2 mod (gcd(n_1, n_2))
holds, then the tasks scheduled in the subtrees rooted at e_1
and e_2 can never collide. Using such observations, more
sophisticated packing-based methods for combining forests
can be constructed.

Preliminary performance experimentation has been
undertaken to compare the average performance of the
methods introduced by the present invention for supporting
EPPV service. For the experiments, two basic workload
components were employed, modeling typical scenarios
encountered in today's pay-per-view CMOD media segment
servers.

Workload #1 consisted of relatively long MPEG-1 compressed
media segments with a duration between 90
and 120 minutes (e.g., movie features). The display rate
for all these media segments was equal to r_i=1.5 Mbps.
To model differences in media segment popularity, the
workload comprised two distinct regions: a "hot
region" with retrieval periods between 40 and 60
minutes and a "cold region" with periods between 150
and 180 minutes.

Workload #2 consisted of small media segments
with lengths between 2 and 10 minutes (e.g., commercials
or music media segments). The display rates for
these media segments varied between 2 and 4 Mbps
(i.e., MPEG-1 and 2 compression). Again, segments
were divided between a "hot region" with periods
between 20 and 30 minutes and a "cold region" with
periods between 40 and 60 minutes.

Each component was executed in isolation and mixed
workloads consisting of mixtures of type #1 and type #2
workloads were also investigated. The basic performance
metric was the effectively scheduled disk bandwidth (in
Mbps) for each of the resource scheduling methods introduced
by the present invention. Scaleup experiments in
which the offered load (i.e., number of segments to be
scheduled) was proportionate to the system size (i.e., number
of disks in the server) were concentrated upon. Further,
in all cases, the expected storage requirements of the offered
load were ensured to be approximately equal to the total disk
capacity. This allowed the storage capacity constraint for the
striping-based methods to be ignored.

The results of the experiments, with type #1 workloads
with hot regions of 30% (a graph 1100) and 10% (a graph
1110) are shown in FIGS. 11A and 11B, respectively.
Clearly, the horizontal striping-based method outperforms
both clustering and vertical striping over the entire range of
values for the number of disks. Observe that for type #1
workloads, the maximum number of segments that can be
scheduled is limited by the aggregate disk storage.
Specifically, it is easy to see that the maximum number of
segments that can fit in a disk is 3.95; the average number of
concurrent streams for a segment is (0.3·3+0.7·1)=1.6. Thus
the maximum bandwidth that can be utilized on a single disk
for this mix of accesses is 1.6·3.95·1.5=9.48 Mbps. This
value explains the low scheduled bandwidth output shown in
FIGS. 11A and 11B. Note that, in most cases, the scheduling
tree heuristics of the present invention were able to schedule
the entire offered workload of segments. On the other hand,
the performance of vertical striping methods quickly deteriorates
as the size of the disk array increases.

The performance of the clustering method of the present
invention under Workload #1 suffers from the disk storage
fragmentation due to the large segment sizes. A deterioration
can also be observed in the performance of clustering as the
access skew increases (i.e., the size of the hot region
becomes smaller). This can be explained as follows: PackSegments
first tries to organize the segments that give the
highest profit (i.e., the popular segments). Thus when the hot
region becomes smaller the relative value of the scheduled
subset (as compared to the total workload value) decreases.

The relative performance of the three methods for a type
#2 workload with a 50% hot region is depicted in FIG. 12A
(a graph 1200). Again, the horizontal striping-based method
outperforms both clustering and vertical striping over the
entire range. Note that, compared to type #1
workloads, the relative performance of clustering and vertical
striping methods under this workload of short segments
is significantly worse. This is because both these methods,
being unaware of the periodic nature of segment retrieval,
reserve a specific amount of bandwidth for every segment C_i
during every round of length T. However, for segments
whose length is relatively small compared to their period,
this bandwidth is actually needed only for a small fraction of
rounds. FIG. 12A clearly demonstrates the devastating
effects of this bandwidth wastage and the need for periodic
scheduling methods.

Finally, FIG. 12B depicts (in a graph 1210) the results
obtained for a mixed workload consisting of 30% type #1
segments and 70% type #2 segments. Horizontal striping is
once again consistently better than vertical striping and
clustering over the entire range of disk array sizes. Compared
to pure type #1 or #2 workloads, the clustering-based
method is able to exploit the non-uniformities in the mixed
workload to produce much better packings. This gives
clustering a clear win over vertical striping. Still, its wastefulness
of disk bandwidth for short segments does not allow
it to perform at the level of horizontal striping.

In general, the period T_i of a media segment C_i may be
greater than its length l_i. The methods presented above for
clustering and vertical striping can be used to schedule such
media segments; however, they may be unnecessarily
restrictive.

If T_i > l_i, then under clustering and vertical striping, the
retrieval of a media segment C_i can be modeled as a
collection of periodic real-time tasks with period T_i=n_i·T,
where each task consists of a collection of c_i subtasks that
are T time units apart and have a computation time equal to
the column retrieval time. (c_i is the number of columns in
C_i.) Note that the only difference between this task model
and the one defined above is that the distance between
consecutive subtasks is only one time slot (rather than n_disk).
The scheduling tree methods and packing-based methods of
the present invention for combining forests and trees can
easily be modified to deal with this case.

It has been assumed to this point that segments are stored
on disks using a matrix-based layout scheme. That is, each
column of a segment matrix is stored contiguously. A
column is nothing more than the total amount of data that
needs to be retrieved in a round for all concurrent display
times. Thus, the matrix-based layout provides the advantageous
property of reducing the disk latency overhead within
a round for all the concurrent phases to a single t_lat. On the
other hand, the scheduling and organizing methods of the
present invention can also handle conventional data layout
methods that do not exploit the knowledge of retrieval
periods during data layout.

In addition to supporting EPPV service, the tree-based
scheduling methods of the present invention can offer support
for the Random Access service model described above,
which places resource reservations to allocate independent
physical channels to each individual CMOD client. Under
the Random Access service model the maximum number of
streams that can be concurrently retrieved and, therefore, the
maximum number of concurrent clients that can be supported
is limited by the available resources.

From the above, it is apparent that the present invention
provides various systems and methods of scheduling media
segments of varying display rate, length and/or retrieval
period on at least one clustered, vertically-striped or
horizontally-striped CM database volume. With respect to
the at least one clustered database volume or the at least one
vertically-striped database volume, one method includes the
steps of: (1) associating a display value and a normalized
bandwidth requirement with each of the media segments, (2)
sorting the media segments in a non-increasing order of
value density to obtain an ordered list thereof and (3)
organizing the media segments into the at least one database
volume in an order that increases a total display value of the
media segments. With respect to the at least one
horizontally-striped database volume, one method includes
the steps of: (1) associating a display value with each of the
media segments, (2) sorting the media segments in a non-increasing
order of value density to obtain an ordered list
thereof and (3) building a scheduling tree of the media
segments, the scheduling tree having a structure that
increases a total display value of the media segments.

Although the present invention has been described in
detail, those skilled in the art should understand that they can
make various changes, substitutions and alterations herein
without departing from the spirit and scope of the invention
in its broadest form.

What is claimed is:

1. A system for scheduling media segments of varying
display rate, length and retrieval period on at least one
continuous media database volume, comprising:

an associator that associates a display value and a normalized
bandwidth requirement with each of said
media segments;

a sorter that sorts said media segments in a non-increasing
order of value density to obtain an ordered list thereof;
and

an organizer that organizes said media segments into said
at least one database volume in an order that increases
a total display value of said media segments.

2. The system as recited in claim 1 wherein said associator
further associates a normalized storage capacity with each of
said media segments.

3. The system as recited in claim 1 wherein said at least
one database volume is a set of clustered drives.

4. The system as recited in claim 1 wherein said at least
one database volume is a set of drives employing vertical
striping.

5. The system as recited in claim 2 wherein said at least
one database volume is a set of clustered drives.

6. A method of scheduling media segments of varying
display rate, length and retrieval period on at least one
continuous media database volume, comprising the steps of:

associating a display value and a normalized bandwidth
requirement with each of said media segments;

sorting said media segments in a non-increasing order of
value density to obtain an ordered list thereof; and

organizing said media segments into said at least one
database volume in an order that increases a total
display value of said media segments.

7. The method as recited in claim 6 wherein said step of
associating comprises the step of further associating a normalized
storage capacity with each of said media segments.

8. The method as recited in claim 6 wherein said at least
one database volume is a set of clustered drives.

9. The method as recited in claim 6 wherein said at least
one database volume is a set of drives employing vertical
striping.

10. The method as recited in claim 7 wherein said at least
one database volume is a set of clustered drives.

11. A continuous media system, comprising:

a continuous media-on-demand (CMOD) server having at
least one database volume associated therewith;

a system for scheduling media segments of varying display
rate, length and retrieval period on said at least one
database volume, including:

an associator that associates a display value and a
normalized bandwidth requirement with each of said
media segments,

a sorter that sorts said media segments in a non-increasing
order of value density to obtain an ordered
list thereof, and

an organizer that organizes said media segments into
said at least one database volume in an order that
increases a total display value; and

a plurality of media receivers coupled to said CMOD
server that receive and perform selected ones of said
media segments.

12. The system as recited in claim 11 wherein said
associator further associates a normalized storage capacity
with each of said media segments.

13. The system as recited in claim 11 wherein said at least
one database volume is a set of clustered drives.

14. The system as recited in claim 11 wherein said at least
one database volume is a set of drives employing vertical
striping.

15. The system as recited in claim 12 wherein said at least
one database volume is a set of clustered drives.

16. A system for scheduling media segments of varying
display rate, length and retrieval period on at least one
horizontally-striped continuous media database volume,
comprising:

an associator that associates a display value with each of
said media segments;

a sorter that sorts said media segments in a non-increasing
order of display value to obtain an ordered list thereof;
and

an organizer that builds a scheduling tree of said media
segments, said scheduling tree having a structure that
increases a total display value of said media segments.
17. The system as recited in claim 16 wherein said
scheduling tree schedules simple periodic tasks when peri- 
odic retrieval of each of said media segments consists of a 
simple period task. 

18. The system as recited in claim 16 wherein said
scheduling tree schedules equidistant periodic subtasks 
when periodic retrieval of each of said media segments 
consists of equidistant periodic subtasks. 

19. The system as recited in claim 18 wherein consecutive 
edges of a first-level ancestor node in said scheduling tree
are disabled. 

20. The system as recited in claim 17 wherein said at least 
one database volume is a set of drives employing horizontal 
striping. 

21. The system as recited in claim 18 wherein said at least
one database volume is a set of drives employing horizontal 
striping. 

22. The system as recited in claim 20 wherein said 
organizer builds a plurality of scheduling trees of said media 
segments.

23. The system as recited in claim 21 wherein said 
organizer builds a plurality of scheduling trees of said media 
segments. 

24. A method of scheduling media segments of varying 
display rate, length and retrieval period on at least one
horizontally-striped continuous media database volume, 
comprising the steps of: 

associating a display value with each of said media 
segments; 

sorting said media segments in a non-increasing order of
display value to obtain an ordered list thereof; and 

building a scheduling tree of said media segments, said 
scheduling tree having a structure that increases a total 
display value of said media segments.

25. The system as recited in claim 24 wherein said 
scheduling tree schedules simple periodic tasks when peri- 
odic retrieval of each of said media segments consists of a 
simple period task. 

26. The system as recited in claim 24 wherein said
scheduling tree schedules equidistant periodic subtasks 
when periodic retrieval of each of said media segments 
consists of equidistant periodic subtasks. 

27. The method as recited in claim 26 wherein consecutive
edges of a first-level ancestor node in said scheduling
tree are disabled. 

28. The method as recited in claim 25 wherein said at least 
one database volume is a set of drives employing horizontal 
striping. 

29. The method as recited in claim 26 wherein said at least
one database volume is a set of drives employing horizontal 
striping. 
30. The method as recited in claim 28 further comprising
the step of building a plurality of scheduling trees of said 
media segments. 

31. The method as recited in claim 29 further comprising 
the step of building a plurality of scheduling trees of said 
media segments. 

32. A continuous media system, comprising: 

a continuous media-on-demand (CMOD) server having at 
least one horizontally-striped database volume associ- 
ated therewith; 
a system for scheduling media segments of varying dis- 
play rate, length and retrieval period on said at least one 
horizontally-striped database volume, including: 
an associator that associates a display value with each
of said media segments,
a sorter that sorts said media segments in a non- 
increasing order of display value to obtain an ordered 
list thereof, and 
an organizer that builds a scheduling tree of said media 
segments, said scheduling tree having a structure that 
increases a total display value of said media seg- 
ments; and 

a plurality of media receivers coupled to said CMOD 
server that receive and perform selected ones of said 
media segments. 

33. The system as recited in claim 32 wherein said 
scheduling tree schedules simple periodic tasks when peri- 
odic retrieval of each of said media segments consists of a 
simple period task. 

34. The system as recited in claim 32 wherein said 
scheduling tree schedules equidistant periodic subtasks 
when periodic retrieval of each of said media segments 
consists of equidistant periodic subtasks. 

35. The system as recited in claim 34 wherein consecutive 
edges of a first-level ancestor node in said scheduling tree 
are disabled. 

36. The system as recited in claim 33 wherein said at least 
one database volume is a set of drives employing horizontal 
striping. 

37. The system as recited in claim 34 wherein said at least 
one database volume is a set of drives employing horizontal 
striping. 

38. The system as recited in claim 36 wherein said 
organizer builds a plurality of scheduling trees of said media 
segments. 

39. The system as recited in claim 37 wherein said 
organizer builds a plurality of scheduling trees of said media 
segments. 

* * * * * 